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Description 

BACKGROUND OF THE INVENTION 

5 FIELD OF THE INVENTION 

[0001] The presenl invention relates generally to methods and systems for segmenting digitally scanned documents 
into two or more planes, and more particularly to methods and systems for segmenting digitally scanned documents 
into planes suitable for a Multiple Raster Content (MRC) representation of documents. 

w 

DESCRIPTION OF RELATED ART 

[0002] The MRC representation of documents is versatile. It provides the ability to represent color images and either 
color or monochrome text. The MRC representation enables the use of multiple "planes" forthe purpose of representing 
15 the content of documents. The MRC representation is becoming increasingly important in the marketplace. It has been 
already established as the main color-fax standard. 

[0003] In an MRC representation, an image is represented by more than one image plane. The main advantage of 
the MRC representation of documents is to provide an efficient way to store, transmit, and manipulate large digital 
color documents. The method exploits the properties of the human vision system, where the ability to distinguish small 
20 color variations is greatly reduced in the presence of high-contrast edges. The edge information is normally separated 
from the smoothly varying color information, and encoded (possibly at higher resolution than 1 bit per pixel) in one of 
the planes, called the Selector plane. Following a careful separation, the various planes could be independently com- 
pressed using standard compression schemes (such as JPEG and G4) with good compression arid high quality at the 
same time. 

25 [0004] There is a need for a method and a system for efficiently separating an image 'into a set of planes, such that 
the advantages of the MRC representation can be fully exploited: 

SUMMARY OF THE INVENTION 

30 [0005] A method and a system for separating an image signal into a set of image planes are disclosed. The system 
comprises a min-max module, a dynamic threshold module and a separation module. The min-max module receives 
the image signal, searches for a minimum and maximum within at least one window centered on a current pixel in the 
image signal. The dynamic threshold module computes, forthe window, based on the respective minimum and max- 
imum received from the min-max module and the current pixel, a respective indicator representing the distance and 

35 direction of the current pixel relative to a respective threshold plane, and outputs a control signal based on the indicator. 
The separation module separates the image signal into the set of image planes in accordance with the control signal 
by including a representation of the current pixel in at least.one of the image planes. 

[0006] In one embodiment of the method defined in claim 1, operation (a) is performed for a set of windows and 
wherein operation (b) comprises the operations of: 

40 

computing, for each of the windows, a respective contrast vector 

computing, for each of the windows, a bias vector and a dot product of the respective contrast vector and a re- 
spective thresholded pixel vector representing the current pixel thresholded by the bias vector, the dot product 
representing the respective indicator; and 
45 outputting a control signal based on the respective indicators. 

[0007] In a further embodiment, for one of the windows, the bias vector represents the average between the respec- 
tive maximum and minimum. In a further embodiment, for one of the windows, the bias vector represents the average 
between a vector representing a lowpass filtered neighborhood of the current pixel and the average between the re- 
50 spective maximum and minimum. 

[0008] In a further embodiment operation (b) comprises the operations of: 

(1 ) computing a first indicator for a first window using a first logic block, the first indicator representing the distance 
and direction of the current pixel relative to a first threshold plane; 
55 (2). computing a second indicator for a second window using a second logic block, the second indicator represent- 

ing the distance and direction of the current pixel relative to a second threshold plane; 

(3) thresholding the current pixel in the image signal using a third logic block and outputfing a third logic block 
signal; and 
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(4) generating the control signals using a decision module in communication with the first, second, and third logic 
blocks, the control signals being based on the first and second indicators and the third logic block signal. 

[0009] In a further embodiment operation (1 ) further comprises the operation of computing a first activity measure 
5 indicating activity in the first window, and wherein operation (2) further comprises the operation of computing a second 
activity measure indicating activity in Ihe second window. 
[0010] In a further embodiment operation (4) comprises the operations of: 

comparing the first activity measure with the second activity measure and outputting a select signal, using a com- 
10 parator block; and 

receiving the first indicator, the second indicator and the third logic block signal, selecting arid outputting one of 
the first indicator, the second indicator and third logic block signal in accordance with the select signal, using a 
multiplexer. 

15 [0011] In a further embodiment operation (c) comprises the operations of: 

receiving the control signal and producing a selector signal, using a selector module; 
receiving the selector signal and producing a decision signal, using an edge processing modules and 
receiving the image signal and the decision signal, and outputting a foreground signal and a background signal, 
20 using a foreground/background separation module, a representation of the current pixel of the image signal being 

included in at least one of the foreground signal and the background signal in accordance with the dedsion signal. 

[0012] In a further embodiment operation (c) further comprises the operations of: 

25 receiving the foreground signal and the background signal; 

filling undefined pixels in the foreground and background signals with values computed so as to substantially 
prevent artifacts and to facilitate good compression ratio; and 
outputting a final foreground signal and a final background signal. 

30 [0013] In a further embodiment the operation of filling comprises: 

extending content of defined pixels in each of the foreground and background signals to neighboring undefined 
pixels by filling neighboring undefined pixels with diluted foreground and background values, respectively, using 
a dilate module; 

35 

averaging non-zero content of the diluted foreground and background values over minimum coded unit blocks and 
outputting averaged block values, using a block average module; and 

filling any remaining undefined pixels with the averaged block values, using a fill module. 
40 [0014] In one embodiment of the system as defined in claim 5, the min-max module searches for a minimum and 
maximum within each of a set of windows centered on the current pixel in the image signal, and wherein the dynamic 
threshold module comprises: 

a first module computing, for each of the windows, a respective contrast vector; and 
45 a second module computing, for each of the windows, a bias vector and a dot product of the respective contrast 

vector and a respective thresholded pixel vector representing the current pixel thresholded by the bias vector, the 
dot product representing the respective indicator. 

[0015] In a further embodiment, for one of the windows, the second module computes the bias vector by computing 
the average between the respective maximum and minimum. 

[0016] In a further embodiment, for one of the windows, the second module computes the bias vector by computing 
the average between a vector representing a lowpass filtered neighborhood of the current pixel and the average be- 
tween the respective maximum and minimum. 
[0017] In a further embodiment the threshold module comprises: 

(1) a first logic block computing a first indicator for the first window, the first indicator representing the distance 
and direction of the current pixel relative to a first threshold plane; 

(2) a second logic block computing a second indicator for the second window, the second indicator representing 
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the distance and direction of the current pixel relative to second threshold plane; 

(3) a third logic block thresholding the current pixel in the image signal and outputting a third logic block signal: and 

(4) a decision module generating the control signals in communication with the first, second, and third logic blocks, 
the control signals being based on the first and second indicators and the third logic block signal. 

5 

[0018] In a Turther embodiment the first logic block computes a Tirst activity measure indicating activity in the first 
window, and wherein the second logic block computes a second activity measure indicating activity in the second 
window. 

[0019] In a further embodiment the decision module comprises: 

10 

a comparator block comparing the first activity measure with the second activity measure and outputting a select 
signal; and 

a multiplexer receiving the first indicator, the second indicator and the third logic block signal, selecting and out- 
putting one of the first indicator, the second indicator and third logic block signal in accordance with the select signal. 

15 

[0020] In a further embodiment the separation module comprises: 

a selector module receiving the control signal and producing a selector signal; 
an edge processing module receiving the selector signal and producing a decision signal; 
20 a foreground/background separation module receiving the image signal and the decision signal, and outputting a 

foreground signal and a background signal, a representation of the current pixel of the image signal being included 
in at least one of the foreground signal and the background signal in accordance with the dedsion signal. 

[0021] In a further embodiment the separation module further comprises: 

25 

a cleanup module receiving the foreground signal and the background signal, filling undefined pixels in the fore- 
ground and background signals with values computed so as to substantially prevent compression ringing artifacts 
and to facilitate good compression ratio, and outputting a final foreground signal and a fnal background signal. 

30 [0022] In a further embodiment the cleanup moduJe comprises: 

a dilate module extending content of defined pixels in each of the foreground and background signals to neighboring 
undefined pixels by filling neighboring undefined pixels with diluted foreground and background values, respec- 
tively; 

35 

a block average module averaging non-zero content of the diluted foreground and background values over mini- 
mum coded unit blocks and outputting averaged block values; and 

a fill module filling any remaining undefined pixels with the averaged block values. 

40 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0023] The features and advantages of the present invention will become apparent from the following detailed de- 
scription of the present invention in which: 

45 

FIG. 1 illustrates the MRC structure for documents. 

FIG. 2 shows the block diagram of the system of the present invention. 

so FIG. 3 shows the block diagram of an embodiment of the system of the present invention. 

FIG. 4 illustrates the function of the Dependent Min-Max block E1 used in one: embodiment of the system of the 
present invention. 

55 FIG. 5 illustrates the function of the Dependent Min-Max Sub-Sample block E2 used in one embodiment of the 

system of the present invention. 

FIG. 6 illustrates the functions of the Dependent Max block E3 and Dependent Min block E4 used in one embod- 
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iment of the system of the present invention. 

FIG. 7 illustrates the two window contexts employed by one embodiment of the Dynamic Threshold module. 

5 FIG. 8 shows the block diagram of one embodiment of the Dynamic Threshold module. 

FIG. 9 shows an implementation of the comparator logic block included in one embodiment of the Dynamic Thresh- 
old module. 

10 FIG. 1 0 shows the truth table of the comparator logic block of FIG. 9. 

FIG. 11 shows an implementation of the selector logic module included in one embodiment of the Dynamic Thresh- 
old module. 

'5 FIG. 12 illustrates the function of the Edge Processing block included in the Separation module. 

FIG. 13 illustrates the decision range used by the Separation module for separating the image signal into the 
Background and Foreground planes. 

20 FIG. 14 shows a block diagram of one implementation of the FG/BG Cleanup block included in one embodiment 

of the Separation module. 

FIG. 15 illustrates the dilate operation used in one implementation of the FG/BG Cleanup block included in one 
embodiment of the Separation module. 

25 

FIG. 16 is a graphical illustration of equations (1) through (4). 

FIG. 17 is a graphical illustration of equations (6) through (9). 

30 FIG. 1 8 shows an exemplary structure of the halftone estimate module. 

FIG. 19 shows a min-max detection scheme used by the min-max detection modules included in the halftone 
estimate module of FIG. 18. 

35 FIG. 20 illustrates the equations that implement the halftone weight module included in the halftone estimate mod- 

ule. 

DETAILED DESCRIPTION OF THE INVENTION 

-to [0024] The present invention provides a method and a system for separating an image signal into a set of image 
planes. The image signal represents a digitally scanned document. The image planes are suitable for a Mixed Raster 
Content (MRC) representation of the digitally scanned document. 

[0025] FIG. 1 shows the general MRC representation. The representation comprises up to four independent planes: 
Foreground, Background : Selector, and Rendering Hints. In the most general case, there could be multiple Foreground 
and Selector pairs at higher levels. However, in most applications, the representation is limited to three or four planes. 
The Background plane is typically used for storing continuous-tone information such as pictures and/or smoothly var- 
ying background colors. The Selector plane normally holds the image of text (binary) as well as other edge information 
(e.g., line art drawings). The Foreground plane usually holds the color of the corresponding text and/or line art. However, 
the MRC representation only specifies the planes and their associated compression methods It does not otherwise 
restrict nor enforce the content of each of the planes. The content of each of the planes may be defined appropriately 
by an implementation of the MRC representation. 
[0026] The MRC structure also allows for a fourth plane, the Rendering Hints plane, which is used for communicating 
additional information about the content of the document. For example, the Rendering Hints plane may carry the ICC 
(International Color Consortium) color hints that identify the best color matching strategy for the various objects on the 
55 page. 

[0027] The Foreground and Background planes are defined to be two full-color (L, a, b) planes. The Selector plane 
is defined as a binary (1-bit deep) plane. The Rendering Hints plane is typically restricted to an 8-bit plane. One 
exemplary MRC representation specifies that the Foreground and Background are to be JPEG compressed, and that 
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the Selector plane is to be ITU-G4 compressed (standard 'Group 4 facsimile compression). The Rendering Hints plane 
is considered to be optional, but if one is used, a compression scheme similar to the Lempel-Zev-Welch scheme may 
be used tor its compression. In general, the Foreground, Background. Selector and Rendering Hints planes can all be 
at different resolutions, and they are not required to maintain the original source input resolution. 

5 [0028] The method for assembling back a "segmented" MRC image from its components (i.e. , planes) is by "pouring" 
the Foreground colors through the Selector plane "mask" on lop of the Background plane, thus overwriting ihe previous 
content of the Background plane at these locations. In other words, the assembly is achieved by multiplexing between 
the Foreground and Background information on a pixel by pixel basis, based on the binary control signal of the Selector 
plane. For example, if the Selector value is 1 , the content of Foreground is used; otherwise (i.e., for Selector value = 

10 0) the content of Background is used. The multiplexing operation is repeated on a pixel by pixel basis until all of the 
output pixels have been defined. 

[0029] The main advantage of the MRC representation of documents is to provide an efficient way to store, transmit, 
and manipulate large digital color documents. The method exploits the properties of the human vision system, where 
the ability to distinguish small color variations is greatly reduced in the presence of high-contrast edges. The edge 
15 information is normally separated from the smoothly varying color information, and encoded (possibly at higher reso- 
lution than 1 Selector sample per source pixel) in the Selector plane. Following a careful Separation, the various planes 
could be independently compressed using standard compression schemes (such as JPEG and G4) with good com- 
pression and high quality at the same time. 

[0030] The Segmentation system of the present invention is used for splitting an incoming image into three or more 

20 planes suitable for an MRC representation of the image. 

[0031] FIG. 2 shows a block diagram of the Segmentation system of the present invention. Segmentation system 
200 comprises a Min-Max module 202, a Dynamic Threshold module 204 and a Separation module 206. The Min-Max 
module 202 receives the image signal DSC, searches for minima and maxima within a set of windows centered on a 
pixel in the image signal. The Dynamic Threshold module 204 computes, for each of the windows, based on the minima 

25 and maxima received from the min-max module and the current pixel, a respective indicator representing the distance 
and direction of the current pixel relative to a respective threshold plane, and outputs a control signal based on the 
indicators. The Separation module 206 separates the image signal into the set of image planes in accordance with the 
control signal by including a representation of the current pixel in at least one of the image planes. 
[0032] FIG. 3 shows a block diagram of one embodiment 300 of the Segmentation system 200. 

30 [0033] For best performance of the Segmentation system 300, the input signal DSC should be free of most of the 
original mid-frequency halftone patterns of the original scanned image. These halftone frequencies are typically elim- 
inated by passing the input image through a de-screen system first. However, in some situations, such as for clean 
PDL (Page Description Language) printing, the input signal may be known to be free of problematic halftone frequen- 
cies. In such situations, the de-screen operation is not needed and the clean input signal can be directly fed into the 

35 Segmentation system. 

[0034] For ease of explanation, in the description of the Segmentation system 300 herein, the source input image 
DSC, as well as the Foreground FQ and Background BQ outputs, are all assumed to be full-color (L, a, b) planes, 
while the Selector plane SEL output is binary (1-bit). It is understood that these assumptions are not to be construed 
as limitations of the applications of the present invention. 
-w [0035] In general, the Foreground, Background, and Selector planes could all be at different resolutions relative to 
the input image DSC. For example, the Foreground and Background planes are typically down-sampled (for better 
compression) while the Selector plane is typically up-sampled (for better edge quality) from the original input resolution. 
The amount of up or down sampling may be fully programmable under software control. 

[0036] The Segmentation system 300 may also receive and use the optional estimated frequency Halftone Weight 
45 HTW and full color Super Blur BLR_A signals when they are available. These optional signals may be generated by 
a de-screen or filtering system such as the one described in a co-pending patent application. The optional full color 
'Super Blur BLR A signal may be generated by lowpass filtering the image source signal with a filter that has a very 
large filter span (i.e., very low cut-off frequency). The optional estimated frequency Halftone Weight HTW will be de- 
scribed in detail later in connection with FIG. 18, FIG. 19 and FIG. 20. 
50 [0037] The Segmentation system 300 comprises a Min-Max module 310, a Dynamic Threshold module 320, and a 
Separation module 330. 

[0038] The Min-Max module 310 comprises a Dependent Min-Max block E1, a Dependent Min-Max Sub-Sample 
block E2 and two Dependent Min-Max blocks E3, E4. The Min-Max module 310 receives the input image signal DSC 
(3-dimensional), computes and outputs two sets of maximum and minimum vectors (Mx, Mn), (MX, WIN), each set 
55 corresponding to a different window. 

[0039] The Dynamic Threshold module 320 receives the input image signal DSC, and the vectors (Mx, Mn), (MX, 
MN) from the Min-Max module 310 and computes, for each of the windows, based on the respective minimum and 
maximum received from the min-max module and the current pixel, a respective indicator representing the distance 
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and direction of the current pixel relative to a respective threshold plane, and outputs based on the indicators a control 
signal GRS to the Separation module 330. The optional control signals SEG. ENH may also be outputted. The Dynamic 
Threshold module 320 also receives the optional estimated trequency Halftone Weight HTW and full color Super Blur 
BLR_A signals when they are available. 

5 [0040] The Separation module 330 comprises a Selector Logic block E6. an Edge Processing block E7, a FG/BG 
Separation block E8, and a FG/BG Cleanup block E9. The Separation module 330 receives the image signal DSC, 
the vectors Wlx, Mn Tram ihe Min-Max module 310, the control signal GRS and the optional control signals SEG, ENH 
from the Dynamic Threshold module 320, andoutputs the three signals BG, FG, SEL, which correspond to the Back- 
ground, Foreground, and Selector planes of an MRC representation of the image DSC, respectively. 

10 [0041] The Dependent Min-Max block E1 receives the input image signal DSC, and searches in a 5x5 window cen- 
tered on the current pixel of interest for the minimum value (vector) Win and maximum value (vector) Mk. Tho vectors 
Mn and Mx represent the minimum and maximum in the window context of 5x5 pixels. The meaning of those rectors 
will be described in detail later. 

[0042] The Dependent Min-Max Sub-Sample block E2 receives the input image signal DSC, and searches for the 

15 minimum and maximum luminance value in each of the non-overlapping 8x8 windows, and also provides the corre- 
sponding chroma values at these locations. By using non-overlapping 8x8 windows, the Dependent Min-Max Sub- 
Sample block E2 effectively sub-samples the minimum and maximum values by a factor 1 of 8 in each direction, thus 
reducing the overall bandwidth by a factor of 64. The sub-sampled outputs are then fed to the two Dependent Min- 
Max blocks E3 and E4, which search for the minimum and maximum vectors WIN and MX over a 9x9 window centered 

20 on the original (before sub-sampling) 8x8 window that contains the current pixel of interest. Thus, the WIN and WIX 
vectors correspond to the minimum of all the minima and the maximum of all the maxima from the non-overlapping 
8x8 windows, respectively. Due to the sub-sampling (by 8) effect, the 9x9 window actually corresponds to a window 
context of 72x72 pixels. It is noted that capital letters are used for vectors WIN and WIX to distinguish them from the 
vectors Win and Wix (outputs of block E1) and to indicate that they represent the minimum and maximum in the larger 

25 window context of 72x72 pixels overall. 

[0043] The two sets of minimum and maximum vectors (Win, Wlx) and (WIN, MX) are fed to the Dynamic Threshold 
Module 320. The Dynamic Threshold Module 320 outputs the monochrome 8-bit signal GRS whose biased zero cross- 
ings represent the locations of edges in the Selector planes. In addition, The Dynamic Threshold Module may also 
generate the optional binary control signal SEG and the optional 8-bit segmentation enhancement control ENH. The 

30 optional binary control signal SEG provides an external means (similar to that of an override switch) to control the 
segmentation operation of the FG/BG Separation block E8 of Separation module 330 (see equations (14) through 
(20)). The optional 8-bit segmentation enhancement control ENH provides to the FG/BG Separation block E8 the 
amount of enhancement to apply. 

[0044] The Selector Logic block E6 receives the 8-bit Gray Selector signal GRS from the Dynamic Threshold Module 
35 320, up-samples it by doubling the resolution, and then thresholds it at the zero crossings to produce the binary Selector 
plane output SEL. For high-quality text and line-art reproduction, the Selector plane is typically kept at twice the input 
resolution (1 200 dpi for a 600 dpi input), although it could be programmed for even higher ratio (in one implementation, 
up to 8 times the input resolution) under software control. 

[0045] But in applications that do not require very high quality, the Selector plane could be at the same resolution 
40 as the input signal DSC. The Edge Processing block E7 receives the high resolution Selector output SEL and counts 
the number of ON and OFF pixels in a 5x5 (high-resolution) window centered on the current (low-resolution) pixel of 
interest. The Edge Processing block E7 outputs the two-bit signal SEE. The SEE signal is set to 0 if all of the input 
pixels inside the 5x5 window are OFF (corresponding to a 5x5 constant Background area). Similarly, the SEE signal 
is set to 3 if all of the input pixels inside the window are ON (corresponding to a 3x3 constant Foreground area). The 
45 SEE output is set to 1 or 2 if the 3x3 window is mostly Background (white) or mostly Foreground (black), respectively. 
[0046] The FG/BG Separation block E8 receives the full color source signal DSC to be segmented, the full color 
minimum and maximum vectors Mn, Wlx from the Dependent Min-Max block E1 , the SEE signal from the Edge Process- 
ing block E7, the optional segmentation signal SEG, and the enhancement control signal ENH from the Dynamic 
Threshold Module 320. The FG/BG Separation block E8 performs the MRC segmentation to generate the Foreground 
so and Background information , and produces two full-color outputs Fgr and Bgr as the rough estimates of the Foreground 
and Background planes, respectively. 

[0047] The FG/BG Cleanup block E9 applies additional processing on the rough Foreground and Background esti- 
mates Fgr and Bgr to generate the final Foreground and Background outputs FG and BG. This additional processing 
is to slightly extend the Foreground and Background values beyond the edges and to fill in the undefined pixels in the 
55 Foreground and Background planes with appropriate values. The purpose of this processing is to prevent artifacts that 
may result from a subsequent sampling and JPEG compression and to fill in the yet-undefined pixels with values that 
will result in good JPEG compression ratio. 

[0048] An additional logic inside the FG/BG Cleanup block E9 (see Tile Tag block F7 of FIG. 14) also monitors the 
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Foreground and Background output values to detect and flag tiles that are almost all-black or all-white. Rather than 
encode the output from such tiles into the output file, a special tile marker is used and referenced whenever such a 
tile is detected. This increases the overall compression ratio by eliminating the need to repeatedly encode the common 
all-white or all-black tiles. 

5 [0049] The blocks included in the Min-Max module 310 will be discussed in detail In the following. 

[0050] The Dependent Min-Max block E1 looks for the maximum and minimum values of the luminance component 
L in a 5x5 window centered on the current pixel of interest, and outputs the full-color (luminance and chrominance) 
values at these locations. It is called a Dependent Min-Max to indicate that it only searches for the minimum and 
maximum over a single component, which is the luminance L ; and not over all three components of the image signal 

io DSC. Once the locations of the minimum and maximum luminance are found, the chroma components (a, b) at these 
locations arc also outputtcd. 

[0051] The Dependent Min-Max block E1 outputs two vectors of full-color (L f a, b) signals Mn = (L Mn , a Mn , b Mn ), and 
Mx = (L Mx , a Mx b Mx ), corresponding to the minimum and maximum values in the 5x5 window, respectively. The outputs 
Mn and Mx are at the same pixel rate as the input signal DSC. 

15 [0052] FIG 4 illustrates the operation of the Dependent Min-Max block E1 . The content of the DSC luminance data 
is first searched in a 5x5 luminance window centered on the current pixel of interest to find the locations of the smallest 
and largest L values. If the minimum or maximum L values are not unique (that is, if there is more than one location 
having the same minimum or maximum value), the location of the one first encountered is used. The output of this 
search process is a unique pair (L Mn L Mx ) of the minimum and maximum L values as well as their relative location 

20 within the 5x5 window. 

[0053] The Dependent Min-Max block E1 then uses the relative location information to index the corresponding 
chroma (a, b) components in the two corresponding 5x5 chroma windows and retrieve the chroma values at these 
locations. Thus, the relative location of the maximum L value L Mx is used to address the 5x5 chroma windows and 
retrieve the chroma pair (a Mx , b Mx ) at this location. Together, the triplet (L Mx a Mx b Mx ) forms the output ftflx from the 
25 Dependent Min-Max block E1. Similarly, the relative location of the minimum L value L Mn is used to address the 5x5 
chroma windows and retrieve the chroma pair (a Mn , b Mn ) at this location. The triplet (L^, a Mn , b Mn ) forms the output 
Win from the Dependent Min-Max block E1 . 

[0054] The implementation of the Dependent Min-Max block E1 can be greatly accelerated by taking advantage of 
the sequential nature of the operation and the type of operations (min-max) that is being performed. For example, as 

30 the operation is advanced to the subsequent pixel, the extreme values (i.e., maximum and minimum) and corresponding 
locations for the previous pixel are already known. Since the current 5x5 window greatly overlaps the previous window, 
by keeping track of the previous window content, the Dependent Min-Max block E1 has to sort out only the newest 
and oldest 5x1 columns of L values on either side of the previous window. The center 3x5 area is common to both 
previous window and current window, and the new address locations of the previous minimum and maximum values 

35 in the previous window are at an offset of 1 in the fast scan direction relative to their previous locations. The previous 
minimum and maximum are compared to values in the newest column of L values to yield the new maximum and 
minimum L values. 

[0055] FIG. 5 illustrates the Dependent Min-Max Sub-Sample block E2. Block E2 receives the full-color (L, a ; b) input 
signal DSC and produces two full-color sub-sampled minimum and maximum outputs 502 and 504. Block E2 searches 
*o for the minimum'and maximum luminance values over non-overlapping 8x8 windows. The locations of the minimum 
and maximum luminance values are then used to index the chroma windows and retrieve the corresponding chroma 
values at these locations. 

[0056] By using non-overlapping 8x8 windows, the operation of the Dependent Min-Max Sub-Sample block E2 is 
effectively sub-sampling the min and max outputs (that would' have been produced had a sliding window been used 
45 instead of non-overlapping windows) by a factor of 8 in each direction, thereby reducing the overall output data rate 
by a factor of 64. 

[0057] The minimum output 504 corresponds to the triplet (L MIN , a MIN , b MIN ) formed by the minimum luminance value 
L M)N of the input signal DSC within the 8x8 window containing the current pixel of interest, and the corresponding 
chroma (a, b) values (a MIN , b M)N ) at this minimum luminance location. Similarly, the maximum output 502 corresponds 
50 to the triplet (L MAX , a MAX , b MAx) formed by the maximum luminance value L MAX of the input signal DSC within the 8x8 
window containing the current pixel of interest, and the corresponding chroma (a, b) values (a MAX , b MAX ) at this max- 
imum luminance location. If the minimum or maximum luminance values are not unique (i.e., if there is more than one 
location with the same maximum or minimum values), the one first encountered is used. 

[0058] The sub-sampling operation is achieved by advancing the current pixel position by 8 in the fast scan direction 
55 (and also, upon reaching the end of a line, by 8 lines in the slow scan direction) to maintain the non-overlapping windows 
condition. 

[0059] The 8-times (abbreviated as 8x) reduction factor (in each dimension) of the Dependent Min-Max Sub-Sample 
E2 block is designed in accordance with the amount of sub-sampling desired for the Foreground and Background 
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planes (normally a sub-sampling factor of 2). For higher output image quality (as is the case with a clean PDL input 
image, for example), it may be desirable to not sub-sample the Foreground and Background outputs at all. In such a 
case, a smaller amount of sub-sampling factor (e.g., only 4x) is to be applied instead of the 8x factor above. If a sub- 
sampling factor of 4 (for each direction) is to be applied, 4x4 non-overlapping windows are used. 
5 [0060] The Dependent Min-Max Sub-Sample block E2 is used in conjunction with the two Dependent Min and Max 
Unils E3 and E4 lo produce a Min-Max analysis similar to lhat of the Dependent Min-Max block E1 , but covering a 
much larger area context (72x72 pixels as compared lo 5x5 pixels) and at a coarser resolution to reduce the overall 
bandwidth. 

[0061] FIG. 6 illustrates the functions of the Dependent Max block E3 and Dependent Min block E4 as used in one 

10 embodiment of the system of the present invention; 

[0062] The Dependent Max block E3 receives the full-color dependent maximum output 502 from the Dependent 
Min-Max Sub-Sample block E2, searches the content of the luminance data in the signal 502 in a 9x9 luminance window 
centered on the current pixel of interest to find the location of the maximum L value. If the maximum. L value is not 
unique (that is, if there is more than one location having the same maximum value), the location of the one first en- 

15 countered is used. The output of this search process is the maximum value L MX as well as its relative location within 
the 9x9 window. 

[0063] The Dependent Max block E3 then uses the relative location information of L MX to index the corresponding 
chroma (a, b) components in the two corresponding 9x9 chroma windows and retrieve the chroma values at this location. 
Thus, the relative location of the maximum L value L MX is used to address the 9x9 chroma windows and retrieve the 
chroma pair (a MX , b Mx ) at this location (as illustrated in FIG. 6). The triplet (L MX , a MX , b MX ) forms the output MX of the 
Dependent Max block E3. 

[0064] The Dependent Min block E4 receives the full-color dependent minimum output 504 from the Dependent Min- 
Max Sub-Sample block E2, searches the content of the luminance data in the signal 504 in a 9x9 luminance window 
centered on the current pixel of interest to find the location of the minimum L value. If the minimum L value is not unique 
(that is, if there is more than one location having the same minimum value), the location of the first one encountered 
is used. The output of this search process is the minimum value L MN as well as its relative location within the 9x9 window. 
[0065] The Dependent Min block E4 then uses the relative location information of L MN to index the corresponding 
chroma (a, b) components in the two corresponding 9x9 chroma windows and retrieve the chroma values at this location. 
Thus, the relative location of the minimum L value L MN is used to address the 9x9 chroma windows and retrieve the 
chroma pair (a MN , b MN ) at this location (as illustrated in FIG. 6). The triplet (L MN , a MN , b MN ) forms the output RAN of the 
Dependent Min block E4. 

[0066] By applying the Dependent Min block E4 on the dependent minimum output 504 of the Dependent Min-Max 
Sub-Sample block E2, the dependent minimum operation is effectively extended over a larger area to provide a de- 
pendent minimum analysis (the MN is minimum of minima received from block E2). Similarly, the Dependent Max block 
effectively provides a dependent maximum analysis over the extended area (the MX is maximum of maxima received 
from block E2). Since both inputs 502 and 504 are already sub-sampled by a factor of 8 in each direction (as compared 
to the original pixel resolution of input image DSC), the equivalent window area for each of the dependent minimum 
MN and maximum MX is 72x72 pixels at the original pixel resolution. 

[0067] The Dynamic Threshold module 320 applies adaptive thresholding to the incoming source signal DSC to 
generate a monochrome 8-bit gray signal GRS output, whose zero crossings represent the edges in the Selector plane . 
The Dynamic Threshold module 320 utilizes the two sets of m in/max values (Mn, Mx) and (MN, MX) from the 5x5 flne- 
and 9x9 coarse resolution windows, and may also receive the Halftone Weight estimate HTW and the Super Blur 
BLR_A signals, when they are available. The Dynamic Threshold module 320 produces the gray selector signal GRS, 
the binary segmentation signal SEG and the 8-bit signal ENH, which is used to communicate the amount of segmen- 
ts tation enhancement to apply in the FG/BG Separation block E8. 

[0068] FIG. 7 illustrates the three available choices of context area: the Single Pixel area which is the area of the 
current pixel of interest, the 5x5 High-Resolution window W1 . and the 9x9 Coarse Resolution window W2. Recall that 
the 9x9 window context W2 corresponds to a window of 72x72 pixels sub-sampled by 8 in each direction. Each square 
(pixel) in the 9x9 coarse resolution window W2 represents an extremum in a window of 8x8 original pixels (i.e., pixels 
so at the original pixel resolution): The Dynamic Threshold module 320 uses these three predefined context areas in the 
process of determining the gray selector signal GRS. 

[0069] The Single Pixel (current pixel) area is used when no contrast activity (described below) exists in both the 
5x5 window W1 and 9x9 window W2, in which case the luminance of the incoming signal DSC is merely thresholoed 
and the chroma (a, b) components are not used. Otherwise, the 5x5 High-Resolution and 9x9 Coarse Resolution areas 
are used in combination to track and segment the incoming signal DSC based on the level of activity in the windows. 
Activity in the 5x5 window indicates the presence of an image edge in that window. Activity in the 9x9 window indicates 
that an edge is either approaching the small window or leaving the small window. Thus, the large 9x9 window serves 
as a look ahead feature. It also provides the history of where an edge has been. This allows proper setting of the SEE 
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signal (to be described later). The large 9x9 window could be replaced by other embodiment that serves the same 
purposes. The operation of tracking and segmenting the incoming signal DSC based on the level of activity in the 
windows will be described below. 

[0070] FIG. 8 shows a block diagram of an embodiment 800 of the Dynamic Threshold Module 320. The embodiment 

5 800 comprises three logic blocks 810, 820, 830, and a decision module 840. 

[0071] The three logic blocks 810, 820, 830 correspond to the three possible context windows shown in FIG. 7, i.e., 
the Single Pixel area, the 5x5 High-Resolulion window W1, and the 9x9 Coarse Resolution window W2, respectively. 
[0072] The multiplexer MUX can select and pass one of these outputs as the final GRS output signal. The selection 
can be switched on a pixel-by-pixel basis based on the 2-bit signal SEL.The actual selection code for each of the inputs 

10 is shown in FIG. 8 to the right of the input arrows. 

[0073] For the case of a Single Pixel context, the luminance component of the incoming input signal DSC is merely 
biased by subtracting from it a pre determined 8-bit constant THR, using the adder 81 S. The value of THR is stored in 
a programmable register so that it could be adjusted to accommodate the sensor calibration. For an ideal balanced 
incoming signal DSC that spans the full 8-bit luminance range, THR would be normally set to THR ^ 128 in order to 

15 bias the luminance of DSC such that the output signal GRS will have zero mean and the incoming signal will be 
thresholded halfway across. However, the visual threshold may well be skewed away from the center due to the log- 
arithmic response of the human visual system. 

[0074] In addition, the scanner response may vary across the dynamic range, or may not even span the full 8-bit 
range. For example, the peak luminance value is determined by the brightest media reflectance, and the dark current 
20 of the sensor determines the output at low light levels. The value of the threshold register THR can be appropriately 
adjusted to account for the above considerations and better match the desired GRS response. In any case : only the 
luminance component of the incoming signal DSC is used for this biasing. 

[0075] The logic block 820 is used to address the 9x9 coarse resolution window context W2 shown in FIG. 7. The 
inputs to the logic block 820 are the full-color coarse minimum value MN and maximum value MX from the Dependent 
25 Max and Min blocks E3 and E4, respectively. Recall that these values were generated by sub-sampling the outputs of 
the Dependent Min-Max block E1 by a factor of 8 in both drections and then searching for the minimum and maximum 
(i.e. . minimum of minima and maximum of maxima) over a 9x9 window. The operation of the logic block 820 is equivalent 
to performing the scaled dot product of the following two vectors X and Y: 

30 

output 828 = < X, Y > ; (1) 
where < X, Y > Is the scaled dot product of the two vectors X and Y: 

< X, Y > = (X L X a X b ) (Y u Y a> Y b )* = <X L Y L + X a Y a + X b Y b ; (2) 

where 
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Y = DSC-(MX + MN)/2 = 
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[0076] The (L, a, b) values in equation (4) are the corresponding color components of the incoming signal DSC. The 
X vector In equation (3) is the vector difference between the maximum value MX and the minimum value MN. The Y 
vector in equation (4) is the incoming signal DSC minus the average of the minimum WIN and maximum MX values, 
the average being the 3D midpoint between MN and MX. By taking the scaled dot product of these two vectors, the 
5 output is proportional to the relative distance from the plane that is perpendicular to the X vector and crosses it halfway 
along. Since the sought-after information is the location of the zero-crossing, the precise magnitude of the dot product 
is not required. Therefore, the result is divided by an arbitrary factor of 256 (shift right by 8) to scale it back to fit the 
8-bit range. 

[0077] However, since the logic block 820 output (to multiplexer 848) may still occasionally overflow the 8-bit range 
10 (by a factor of roughly 3, or 1 .5 bits), additional logic may be used to limit the logic block 820 output to 255 if it gets 
larger than 255. 

[0078] A scalar measure for the overall contrast magnitude X9 within the coarse resolution 9x9 window is generated 
by adding together the absolute values of the three components of the vector X within the summation block 829: 
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X g = L x + la x l 4- lb x l = L MX - L MN + la MX - a MN l + lb MX - b MN l ; (5) 



[0079] Referring to equation (5), there is no need to take the absolute value of the luminance component L since L 
is confined to the positive range [0. 255]. The implementation of equations (1) through (5) for the logic block 820 is 

20 straight-forward. Referring to logic block 820 in FIG. 8, the first two adders 821, 823 perform the vector sum and 
difference of the 3x1 input signals MX, MN, on a component by component basis. The adder 821 that handles the sum 
also divides the result by 2 (by shifting it right by 1 position) to obtain the average as indicated by the symbol >V2. 
Adder 823 outputs the vector difference X (defined in equation (3)) to block 829. Block 829 computes the sum of 
absolute values of the three components of the vector X and generates the contrast magnitude X Q . Adder 825 calculates 

25 the vector Y in equation (4) by performing the vector difference between the input signal DSC and the output from 
adder 821 . The X and Y vector components are then multiply-and-added together, element by element, to form the 
dot product in the dot product block 827. The output 828 of block 827 is described by equations (1) and (2). 
[008O] FIG. 16 is a graphical illustration of equations (1) through (4). In FIG. 16, the origin of the three-dimensional 
space is assumed to be on the left hand side, as shown. The vectors MX, MN represent the three-dimensional points 

30 MX and MN, respectively. The current image pixel is represented by the vector DSC. As illustrated, the vector X = (MX 
- MN) and the vector Y resuft from vector operations in accordance with equations (3) and (4). The value d 1 represents 
the result of taking the dot product of X and Y. This value is the projection of Y onto X. This value also indicates the 
distance and "direction" of the point represented by the vector Y with respect to the plane P 1 . The plane P 1 is orthogonal 
to the vector X = MX - MN at the midpoint of X. By "direction" of the point represented by vector Y, it is meant whether 

35 this point is above or below the plane P 1 . The plane P 1 represents the threshold plane. The indicator value d 1 indicates 
whether, after thresholding, the current image pixel DSC is above or below the threshold plane, that is, whether it is 
closer to MX or to MN, and by how much. This indicator value d 1 allows a decision to be made regarding segmentation 
of the current pixel. For example, if the thresholded pixel is very close to MX (respectively, MN), a decision can be 
made that the current pixel be included in the Foreground plane (respectively, Background plane). If the thresholded 

40 pixel is too close to the threshold plane, a decision can be made that the current pixel be included in both the Foreground 
and Background planes. 

[0081] Referring to FIG 8, the logic block 830 is used to address the 5x5 high-resolution window context W1 shown 
in FIG. 7. The inputs to the logic block 830 are the full-color minimum and maximum values Mn, Mx from the Dependent 
Min-Max module E1 . The operation of the logic block 830, in forming a scaled dot product, is similar to the logic block 
45 820 described above. 

[0082] The operation of the logic block 830 is equivalent to performing the scaled dot product of the following two 
vectors: 



output 838 = <X\ Y' > ; (6) 
where <X* , Y* > is the scaled dot product between the two vectors X' and Y': 

< X' , Y' > = (X L \ X a \ X b ') (Y L \ Y B \ Y b ') 1 = X L ' Y L ' + X a ' Y a * + X b ' Y b ' ; (7) 

where 
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[0083] The (L, a, b) values in equation (9) are the corresponding color components of the incoming signal DSC. The 
X' vector in equation (8) is the vector difference between the maximum vector Mm and the minimum vector Win. The 
Y* vector in equation (9) is the incoming signal DSC minus the average of the minimum Win and maximum Wix values, 
20 the average being the 3D midpoint between Win and Wix. By taking the scaled dot product of these two vectors, the 
output is proportional to the relative distance from the plane that is perpendicular to the X' vector and crosses it halfway 
along. Since the sought-after information is the location of the zero-crossing, the precise magnitude of the dot product 
is not required. Therefore, the result is divided by an arbitrary factor of 256 (shift right by 8) to scale it back to fit the 
8-bit range. 

25 [0084] However, since the logic block 830 output (to multiplexer 848) may still occasionally overflow the 8-bit range 
(by a factor of roughly 3, or 1 .5 bits), additional logic may be used to limit the logic block 830 output to 255 if it gets 
larger than 255. 

[0085] A scalar measure for the overall contrast magnitude X5 within the fine resolution 5x5 window W1 (FIG. 7) is 
generated by adding together the absolute values of the three components of the vector X' within the summation block 
30 839: 



X 5 = L x . 



+ la x .l + lb x ,l = L Mx 



L Mn + ,a Mx " a Mn' + lb Mx - b Mn'; 
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[0086] Referring to equation (10), there is no need to take the absolute value of the luminance component L since 
L is confined to the positive range [0. .255]. The implementation of equations (6) through ( 1 0) for the logic block 830 is 
straight-forward. Referring to logic block 830 in FIG. 8, the first two adders 831 , 833 perform the vector sum and 
difference of the 3x1 input signals Mx, Mn, on a component by component basis. The adder 831 that handles the sum 
also divides the result by 2 (by shifting it right by 1 position) to obtain the average as indicated by the symbol 1/2. Adder 
833 outputs the vector difference X' (defined in equation (8)) to block 839. Block 839 computes the sum of absolute 
values of the three ccnponents of the vector X' and generates the contrast magnitude X 5 . Adder 834 adds the vector 
signal BLR_A to the vector output of adder 831 and divides the result by 2. Adder 835 calculates the vector Y* in 
equation (9) by performing the vector difference between the input signal DSC and the output from adder 834. The X' 
and Y' vector components are then multiply-and-added together, element by element, to form the dot product in the 
dot product block 837. The output of block 837 is described by equations (6) and (7). 

[0087] It is important to note that the architecture of logic block 830 differs from that of logic block 820 by having the 
added threshold-biasing feature that enhances dark or light thin lines by "nudging" the threshold towards the Super- 
Blur reference signal BLR_A = (L A , a A , b A ) when BLR_A is available. This is accomplished by averaging the Super 
Blur signal BLR_A with the averaged Wix and Win values, to form the alternative Y' vector, as shown in equation (9). 
[0088] FIG. 17 is a graphical illustration of equations (6) through (9). In FIG. 17. the origin of the three-dimensional 
space is assumed to be on the left hand side, as shown. The vectors Wix, Win represent the three-dimensional points 
Mx and Win, respectively. The current image pixel is represented by the vector DSC. The Super-Blur reference signal 
BLR_A is represented by the vector BLR_A. As illustrated, the vector X' = (Wix-Win) and the vector Y' result from vector 
operations in accordance with equations (8) end (9). The value d 2 represents the result of taking the dot product of X' 
and Y\ This value is the projection of Y' onto X'. This value also indicates the distance and "direction" of the point 
represented by the vector Y' with respect to the plane P 2 . The plane P 2 is orthogonal to the vector X = WIX - WIN at a 
point away from the midpoint by a small amount. This amount represents the added threshold-biasing feature discussed 
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in the preceding paragraph. By "direction" of the point represented by vector Y\ it is meant whether this point is above 
or below the plane P 2 . The plane P 2 represents the threshold plane. The indicator value d 2 indicates whether, after 
thresholding, the current image pixel DSC is above or below the threshold plane, that is, whether it is closer to Mx or 
to Mn, and by how much. This indicator value d 2 allows a decision to be made regarding the segmentation of the 
current pixel. For example, if the thresholded pixel is very close to ftflx (respectively, Mn), a decision can be made that 
the current pixel be included in the Foreground plane (respectively, Background plane). If the thresholded pixel is loo 
close to the threshold plane, a decision can be made that the current pixel be included in both the Foreground and 
Background planes. 

[0089] Referring to FIG 8, the decision module 840 receives the output 818 from logic block 810, output 828 and 
contrast magnitude output X 9 from logic block 820, output 838 and contrast magnitude output X 5 from output block 830. 
[0090] The decision module 840 comprises a comparator logic 846, a multiplexer 848, an enhancement coding block 
850, and a comparator 852. The decision module 840 also includes two parameterized picccwisc linear function blocks 
842 and 844 to process the halftone weight signal HTW when it is available from a de-screener system. 
[0091] The comparator logic 846 receives the contrast magnitude outputs X g and X 9 , outputs the select signal SEL 
to control the output GRS of multiplexer 848, and outputs the enable signal ENA to control the enhancement signal 
ENH of the enhancement logic 850. The comparator logic 846 may also use the 8-bit Hatftone Weight frequency es- 
timate HTW, when available, from a de-screener system, after the estimate HTW has passed through a parameterized 
piecewise linear function block 842. 

[0092] Note that, by definition of the min-max operations described previously, the contrast magnitude of the larger 
9x9 (sub-sampled) window W2 must be equal or larger than the contrast magnitude of the smaller 5x5 high-resolution 
window W1 . In other words: 

X 9 >X 5 ; 

[0093] This is due to the fact that, for a larger window that includes a smaller one, the maximum can only be larger 
and the minimum smallerthan those of the smaller window. Furthermore, as the segmentation process proceeds from 
one pixel to the next (in the fast scan direction), the X 9 contrast value remains the same for 8 consecutive pixels until 
the next pixel crosses the 8x8 window boundaries into the next non-overlapping window. The X 5 contrast value, on 
the other hand, may change on a pixel by pixel basis. This behavior is due to the 8x sub-sampling performed by the 
Dependent Min-Max Sub-Sample block E2. 

[0094] FIG 9 shows the block diagram of an embodiment of the comparator logic 846. The two contrast magnitude 
measures X 5 and X 9 are compared to the signal STH, via comparators 904, 902, respectively, to generate the selection 
bits SEL0 and SEL1 , respectively. The bits SEL0 and SEL1 form the 2-bit select signal SELECT. If the halftone weight 
HTW is available, HTW is passed through the piecewise linear function block 842 to produce STH. Otherwise, STH is 
set to a predetermined value. The two bits SEL0 and SEL1 are then combined together by AND gate 906 to generate 
the 1-bit enhancement enable signal ENA. 

[0095] FIG. 10 shows the equivalent Truth Table for the comparator logic. If the contrast measure X 9 of the larger 
9x9 (sub-sampled) window W2 is smaller than STH, then, regardless of the contrast measure X 5 of the smaller 5x5 
window W1 , the SEL1 bit is cleared and the SELECT signal is either 0 or 1 . This causes the multiplexer 848 to select 
the Single Pixel context output 818 (FIG. 8). If, however, there is some activity in the larger 9x9 window W2 but not 
within the smaller 5x5 window W1 , the SELECT signal is set to equal 2 (binary "1 0 H ). This causes the multiplexer 848 
to select in the logic block 820 output 828. If both windows show significant contrast magnitude, the SELECT signal 
is set to 3, resulting in the output 838 of logic block 830 (corresponding to the 5x5 high-resolution window) being 
selected by the multiplexer 848. In addition, when the SELECT signal is 3, the binary enable signal ENA is turned on. 
The signal ENA is used to enable the enhancement block 850 to output the segmentation enhancement signal ENH. 
[0096] Referring to FIG. 8; the enhancement coding block 850 also uses a linear function of the Halftone Weight 
frequency estimate HTW to produce the signal ENH which controls the amount of segmentation enhancement to be 
applied in the FG/BG Separation block E8 (FIG. 3). The HTW signal is fed to the parameterized piecewise linear 
function block 844 which applies a piecewise linear function EEN to the signal HTW, and outputs the resulting signal 
to the enhancement coding block 850. The binary enhancement enable signal ENA from the comparator logic 846 is 
used for gating (i.e., enabling) the enhancement signal ENH as follows. If ENA = 1, then the block 844 output signal 
is passed through to the output ENH; otherwise, all of the ENH bits are forced to zero (disabled). The 8-bit ENH output 
signal communicates the amount of segmentation enhancement to be applied in the FG/BG Separation block E8. 
[0097] Referring to FIG. 8, the comparator 852 compares the optional halftone weight HTW with a predefined thresh- 
old THR SEG and generates the optional binary signal SEG. The optional binary signal SEG provides a means to control 
the segmentation function of the FG/BG Separation block E8. The binary signal SEG is turned on when HTW is larger 
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than THR SEQ . When SEG is on. the segmentation in the FG/BG Separation block E8 is performed in accordance with 
equations (14), (15), (16), (17). The binary signal SEG is turned off when HTW is smaller than or equal to THR SEQ . 
When SEG is off, the segmentation in the FG/BG Separation block E8 is performed in accordance with equations (18), 
(19), (20). The value of the 8-bit THR SEG may be programmable via a register. 

5 [0098] Referring to FIG. 3, the Selector Logic block E6 converts the 8-bit grayscale selector input GRS to the binary 
Selector plane output SEL which could be at a different resolution relative to the input image signal DSC. For high- 
quality text and line art, the binary Selector output SEL is typically kept at a higher resolution than the Background and 
Foreground planes For example, the embodiment 300 uses twice the resolution, so that, for a standard 600-dpi scan- 
ner, the output resolution of the binary SEL signal is normally set at 1 200 dpi. The Selector Logic block E6 interpolates 

10 the gray selector input GRS to the higher resolution and thresholds the resulting interpolated signal to generate the 
binary output SEL which forms the Selector plane. 

[0099] FIG. 11 shows the block diagram of the Selector Logic block E6. The Selector Logic block E6 comprises an 
interpolator 1102 and a comparator 1104. The interpolator 1102 interpolates the 8-bit gray selector input signal GRS 
in the gray domain using a two-dimensional bilinear interpolation scheme. The interpolation factor may be program- 
is mable under software control. In the embodiment 1102 shown in FIG. 11 (and FIG. 3), the interpolation factor is set to 
the default setting of 2x. The output from the interpolator 1 1 02 is fed to the comparator 1 1 04, which thresholds it against 
a value TSL. and outputs the resulting binary output as the Selector signal SEL. The value of the threshold TSL is 
stored in a programmable register so that that it can be changed from one page of a document to the other. For an 
ideal signed gray selector signal GRS that spans the full 8-bit luminance range, the value TSL would be normally set 
20 to the mid-point TSL = 0 to threshold the incoming signal halfway across. However, as explained previously for the 
signal THR in block 810 (FIG. 8), the actual TSL value may be set differently in order to compensate for the actual 
dynamic range of the scanner sensor and/or the properties of the human visual system. 

[0100] Note that for the current 2x interpolation-factor, the binary SEL output data rate is twice as fast as the image 
signal DSC in each direction. This means that for every 8-bit input GRS sample, the Selector Logic block E6 produces 

25 four binary output SEL samples. 

[0101] It is important to note that a vector notation is used for the output SEL to indicate the higher output resolution. 
While the output SEL is still considered to be binary (that is. assuming the values of either 0 or 1 only), each incoming 
GRS input generates four Selector bits at the output (assuming the normal 2x interpolation factor). The output data 
rate is half the input rate, since one 8-bit value is converted to four 1-brit values. 

30 [0102] Referring to FIG. 3 ; the Edge Processing block E7 receives the high resolution Selector output SEL and 
counts the number of on and off pixels in a 5x5 high-resolution window centered on the current pixel of interest. The 
Edge Processing block E7 outputs the two-bit signal SEE. The SEE signal is set to 0 if all of the input pixels inside the 
window are 0 (corresponding to a constant 3x3 original input resolution Background area). Similarly, the SEE signal 
is set to 3 if all of the input pixels inside the window are 1 (corresponding to a constant 3x3 original input resolution 

35 Foreground area). In addition , the SEE output is set to either 1 or 2 if the content of the 3x3 window is mostly Background 
(white) or mostly Foreground (black), respectively. 

[01 03] FIG. 1 2 illustrates the function of the Edge Processing block E7. The operation of block E7 is as follows. The 
Edge Processing block receives as input the binary Selector signal SEL, which is at higher resolution (normally 2x) 
relative to the source input signal DSC. The Edge Processing block E7 maintains a 3x3-pixel context window W3 (at 

•to the original input resolution) centered at the current pixel of interest. Since the Selector signal SEL is at twice the 
original resolution, each of the elements in the 3x3-pixel window W3 contains four of the high-resolution binary SEL 
samples as shown in FIG. 1 2. The thicker lines represent the original DSC resolution, which corresponds to a 6x6-pixel 
context window in the high-resolution domain. However, only the dot content of 5x5 high-resolution pixel area W4 
(shown in dashed lines) is used; the area between W3 and W4 is excluded from the edge processing. 

45 [0104] The 5x5 high-resolution context W4 is designed for "detecting" potential edges in the proximity of the current 
pixel of interest. The window pattern W4 uses a full context of 2 high-resolution pixels extending below and to the right 
of the current pixel, but only one high resolution pixel extending to the top or to the left. This window pattern prevents 
any edge from overlapping with the neighboring pixels, that is, no potential edge location could be detected (i.e., shared) 
more than once, in association with multiple lower resolution pixels. The 4x4=16 possible edge locations within the 

50 current window of interest are indicated in FIG. 12. 

[0105] The Edge Processing block E7 counts the number of high-resolution pixels that are currently turned on in the 
5x5 high-resolution area W4. This number can range from 0 to 25. It is mapped to the output 2-bit signal SEE of the 
Edge Processing block as following: 

55 1 . SEE = 0 if the 5x5 count was 0 (no Foreground pixels found); 

2. SEE = 1 if the count was in the range [1 ..12] inclusive (mostly Background pixels); 
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3. SEE - 2 if the count was in the range [13.. 24] inclusive (mostly Foreground pixels); 

4. SEE = 3 if the count was 25 (only Foreground pixels found); 

5 [0106] The 2-bit output signal SEE is forwarded to the FG/BG Separation block E8. It is noted that this signal is at 
the original input resolution (typically, 600 dpi), although the Selector signal is at the higher resolution of double the 
original input resolution (for this example, 1200 dpi). 

[0107] The FG/BG Separation block E8 splits the incoming image signal DSC into the Foreground and Background 
planes The block uses the full-color minimum and maximum (Win, Mx) outputs from the Dependent Min-Max block 
10 E1, as well as the 2-bit Selector edge count signal SEE from the Edge Processing block E7. In addition, the FG/BG 
Separation block can also enhance the lifting of edges via the segmentation enhancement control signal ENH received 
from the Dynamic Threshold Module 320. As discussed previously, the segmentation process in the FG/BG Separation 
block E8 can be controlled by the optional binary signal SEG. 

[0108] The FG/BG Separation block E8 outputs two full-color rough initial estimates Fgr and Bgr of the Foreground 
15 and Background outputs, respectively. The subsequent module, the FG/BG Cleanup block E9, further processes the 
estimates Fgr and Bgr to ... produce the final Foreground and Background outputs 

[0109] The FG/BG Separation block E8 takes in the full-color source signal DSC to be segmented and produces 
values for one of the Fgr and Bgr outputs, or sometimes for both. The FG/BG Separation block E8 reserves the special 
code of zero luminance and chroma (L = a = b = 0) to indicate an empty (undefined) pixel in either the Foreground 
20 estimate Fgr or the Background estimate Bgr. As the process continues across the page, some of the Foreground 
and Background pixels will remain undefined. The FG/BG Cleanup block will then carefully fill in the values for these 
undefined pixels to keep the compression low and to prevent potential subsequent sub-sampling and JPEG compres- 
sion artifacts. 

[01 10] FiG. 1 3 illustrates the decision range used by the FG/BG Separation block E8 for separating the image signal 
25 into the Background and Foreground planes. The FG/BG Separation block uses the value of the 2-bit Selector edge 
count signal SEE from the Edge Processing block E7 to determine whether to use the Background, Foreground, or 
both. The Foreground is used if SEE > 1, and the Background is used if SEE < 2. Since these two cases are not 
mutually exclusive, both the Foreground and Background are used simultaneously when 1 < SEE < 2. 
[0111] The following two full-color quantities (the purpose of which will be detailed below) are defined: 

30 

FGD = LIM[DSC + (Mn - DSC)(ENH/255)] ; (12) 



BGD = LIM[DSC + (Mx - DSC)(ENH/255)] ; (13) 

where DSC is the full-color input signal, Mn and Mx are the dependent minimum and maximum outputs from the 
Dependent Min-Max block E1, and ENH is the 8-bit segmentation enhancement signal from the Dynamic Threshold 
Module 320. The limit function LIM is used to restrict the result to the 8-bit range [1 . 255] for each of the components, 
thus excluding the special code of zero that is reserved for marking undetermined pixels. Note that since DSC and Mn 
or Mx are full-color (L, a, b) vectors, the operations are in 3D space. 

[0112] For the case of using the Foreground during segmentation, that is, when SEE -{1,2, 3} and SEG = 1 , the 
output Fgr value is determined to be: 

Fgr = FGD, if SEG=1 AND SEE = {1 , 2, 3} ; (14) 



Fgr =0 if SEG=1 AND SEE = 0 ; (15) 

50 

[0113] When the control signal SEG is on (SEG=1) and the Foreground is not used (i.e., when SEE = 0), the Fore- 
ground pixel value is marked as undefined by setting its value to the special code Fgr = 0 (for all of the three components) 
per equation (15). Note that, in one implementation, ENH is extended to a 9-bit representation and its value is incre- 
mented by 1 to allow normalization by 256 instead of 255, thus replacing the unnecessary division with a simple shift 
55 right by 8 positions. 

[0114] As shown in equations (14) and (12), the Foreground estimate Fgr value is interpolated (in 3D space) between 
the current input signal value DSC and the minimum value Mn of the Dependent Min-Max block E1 , depending on the 
amount of segmentation enhancement expressed by the 8-bit signal ENH. When ENH = 0, no enhancement will be 
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done and the output will be set to the input signal Fgr = DSC. As previously discussed, this would be the common 
case unless there would be sufficient contrast activity in both the 5x5 fine resolution window W1 and 9x9 coarse res- 
olution window W2. Equivalent^, when ENH = 255 (corresponding to maximum enhancement), the output will be set 
to the minimum signal Fgr = Mn. This would normally represent the case of a pixel in the immediate vicinity of an edge, 
s where it would be advantageous to enhance the edge by painting the Foreground darker by as much as possible, as 
given by the minimum value in the neighborhood (0=black). In general, the amount of segmentation enhancement ENH 
can vary between the above two extremes, and the output Foreground values will be correspondingly weighted between 
the DSC and Mn values. 

[0115] Similarly, for the case of using the Background during segmentation, i.e., when SEG = 1 and SEE = {0, 1 . 2}, 
10 the output Bgr value is determined by the following equations: 

Bgr=BGD if SEG=1 AND SEE={0, 1,2}; (16) 
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Bgr=0 if SEE =3; (17) 



[01 1 6] The output Bgr value will vary between the input DSC and Mx values in proportion to the amount of segmen- 
tation enhancement ENH, as given by equation (13). Equation (16) is similar to equation (14), with the exception of 
20 using the maximum Mx instead of the minimum Mn (see also equations (12) and (13)), and the different range of SEE. 
The use of Mx for the Bgr output will make it become visually lighter rather than darker as was the case for the use of 
Mn for the Foreground estimate Fgr. 

[0117] Also, as indicated by equation (1 7) and corresponding with (1 5), when the Background is not used (i.e., when 
SEE = 3), the Background value is marked as undefined by setting its value to the special code Bgr = 0 (for all of the 
25 three components L, a, b). 

[0118] In the case where the optional segmentation control is turned off (i.e., SEG =0), the Background is arbitrarily 
set to be a special code, regardless of the input pixel value: 



Bgr=0 : ifSEG = 0; (18) 
and the Foreground is set to be one of the following: 

Fgr = BGD, if SEG=0 AND SEE = {0; 1} ; (1 9) 

Fgr = FGD , if SEG=0 AND SEE = {2, 3} ; (20) 

40 [0119] Equations (18), (19), (20) show that the Background is always set to the special undefined marker Bgr = 0, 
and the Foreground takes on the value of the weighted Foreground or Background pixel of (12) or (13), depending on 
the value of SEE. If SEE = {0,1}, the Background is used, otherwise the Foreground used. 

[0120] It is important to note that the range of SEE for this case, where the optional segmentation control is turned 
off, is different than what was used in equations (14) through (17). Here the Foreground value is simply set to FGD if 

45 most of the pixels are foreground ( SEE ={0, 1) ), or to BGD if they are mostly background ( SEE ={2, 3}). 

[0121] The output from the FG/BG Separation block E8 is the two partially filled full-color planes Fgr and Bgr. Away 
from the edges in the Selector plane SEL, typically only one of the Foreground or Background outputs will contain the 
current pixel color (possibly enhanced), depending on whether it was light or dark. But near the edges, however, in- 
formation might be carried in. both the Foreground and Background channels simultaneously. 

so [0122] The FG/BG Cleanup block E9 applies additional processing on the rough Foreground and Background esti- 
mates Fgr and Bgr to generate the final Foreground and Background outputs FG and BG. The processing done by 
the FG/BG Cleanup block E9 is to slightly extend the Foreground and Background values beyond edges and also fill 
the undefined pixels in the Foreground and Background planes. The purpose of this processing is to prevent artifacts 
that may result from a subsequent sub-sampling and data compression (such as JPEG) and fill in the undefined pixels 

55 to achieve good compression ratio. 

[01 23] An additional, logic inside the FG/BG Cleanup block E9 also monitors the Foreground and Background output 
values to detect and flag constant all-black or all-white tiles. Rather than encode the output from such tiles into the 
output data stream, a special tile marker is used and referenced as many times as needed. This will increase the overall 
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compression ratio by eliminating the need to repeatedly encode the common all-white or all-black tiles. 
[01 24] FIG. 1 4 shows a block diagram of an embodiment of the FG/BG Cleanup block E9. The embodiment comprises 
two Dilate blocks F1 , F2, two Averaging blocks F3, F4, two Fill / Sub-Sample blocks F5, F6, and a file Tag block F7. 
[0125] The Dilate blocks F1 and F2 extend the defined content of the Foreground and Background inputs by two 
5 pixels. The extension by 2 pixels is done in anticipation of a subsequent 2x sub-sampling; other extension sizes could 
be used for different sub-sampling factors. The purpose is to increase the Foreground and Background overlap over 
edges so that the outcome is less sensitive to further sub-sampling. 

[0126] FIG. 15 illustrates the effect of the dilate operation of Dilate blocks F1 and F2. The area 1 504 represents the 
defined pixels. The area 1502 represents the undefined pixels. After the dilate operation, the area of defined pixels is 
io extended to cover also area 1506, and the area of undefined pixels is reduced to area 1508. 

[0127] The operation of the Dilate blocks is achieved by using lowpass filters Fz_n. The n suffix indicates the number 
of the filter coefficients. In one embodiment, two-dimensional 5x5 triangular filters Fz_5 arc used. The z suffix indicates 
that such filter eliminates any zero terms from the total normalization weight. 
[0128] The general Fz_n filter equation is as follows: 

15 

Za» * w,j 

output value - — » ^ ; (21) 

20 where a fJ are the 2D input values, w,j are the 2D fitter coefficients, and 5(a,j) is defined as follows: 

o(a,j) = 1 if a,j * 0 ; otherwise 8(ay) = 0 ; (22) 

25 [01 29] As can be seen from equation (21 ), an Fz_n filter differs from a regular filter in that the total weight is no longer 
just a known normalization constant. Since the number of incoming pixels having the value of zero is not known in 
advance, an ongoing accumulator for the weight must be kept. At the same time that the filter loop is operating, the 
content of the accumulator is increased by the value of the current filter coefficient if the corresponding input value is 
non-zero. Since the total weight is not fixed and known in advance, the final normalization of the filter output is dependent 
on the value of the total weight. However, it is still possible to avoid the division operation in equation (21) by using a 
predetermined multiplication table with multiple choices for the possible total weight values. 

[0130] The purpose of using the special Fz_n filters is to eliminate any pixels marked as undefined with the special 
zero marker from contaminating the filtered output. Since the special marker value was selected to be zero, the unde- 
fined pixels do not affect the numerator in (17). However, the number of these undefined pixels must be tracked in 

35 order to keep the denominator accurate. 

[0131] Whenever the filtered output of the Fz_5 filter in the Dilate block F1 (or F2) is non-zero (that is, at least one 
pixel within the 5x5 window centered on the current pixel of interest is non-zero, or fully defined), it replaces the special 
zero marker used for the undefined pixels. Note that the filtered outputs of the Fz_5 filters in the dilate blocks F1 , F2 
are only used for previously undefined pixels, and only affecting pixels within a two-pixel area in the immediate vicinity 

40 of an edge. 

[0132] Referring to FIG. 14, the Averaging blocks F3 and F4 average the non-zero content of the diluted Foreground 
and Background values over JPEG MCU (Minimum Coded Units) blocks. These blocks have a size of 16 x 16 for a 
non sub-sampled output, and 32 x 32 if the output is sub-sampled by a factor of 2 in each direction (other sizes may 
be used for different sub-sampling factors). Again, the averaging is carefully done to exclude any undefined pixels that 
might otherwise bias the result. The averaging operation could be viewed in a similar manner to using Fz_n filters but 
with constant unity weight rather than triangular shapes. The Block Averages are performed on the fixed non-overlap- 
ping JPEG MCU block grid. 

[0133] The Fill /Sub-Sample blocks F5 and F6 substitute the averaged Background and Foreground values received 
from the F3 and F4 units for the values of any remaining undefined pixels in the diluted Background and Foreground 
50 channels, respectively. With these substituted averaged values, the compression ratio is improved and further JPEG- 
ringing artifacts are largely prevented. Since the Background and Foreground channels are typically sub-sampled, the 
Fill / Sub-Sample blocks F5 and F6 also perform the sub-sampling operation as needed, and output the Background 
and Foreground signals BG, FG. 

[0134] The Tile Tag block F7 monitors the values of the Foreground and Background channels over tiles and flags 
any tiles that effectively have all white or all black values. The tile sizes are programmable, but typical values vary from 
64 x 64 to 512 x 512 pixels. A tile is considered to be ail white if the absolute chroma values (i.e., both lal and Ibl) of 
each and every, pixel within the tile are smaller than given thresholds, and the luminance is larger than a given threshold. 
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Similarly, a tile is considered to be all black if the absolute chroma values (i.e., both lal and Ibl), as well as the luminance 
value, are all smaller than the given thresholds. The three thresholds above may be programmable using registers. 
[0135] The following describes how the optional signal HTW can be generated from a Halftone Estimator module. 
[0136] The Halftone Estimator module is used for measuring the frequency and halftone weight in a small area 
surrounding the current pixel of interest. The inputs to the Halftone Estimator module are the source signal SRC and 
an output BLR_3 from a small lowpass filter (e.g , having 3 coefficients). Both of these inputs are full-color (L, a, b) 
signals. 

[0137] The Halftone Estimator module produces two monochrome (single-channel) output signals, FRQ and HTW, 
representing the estimated frequency and halftone weight, respectively. Each of these signals is represented by an 
8-bit representation. HTW represents the level of confidence in the area being a halftone. If HTW is small (low confi- 
dence), active segmentation is turned off to prevent lifting of each and every halftone dot. 
[0138] FIG. 18 is a block diagram of an embodiment of the Halftone Estimator modulo. 

[0139] As shown in FIG. 18, the embodiment of the Halftone Estimator module comprises two separate frequency 
detection channels operating concurrently, the outputs of which are combined together at the very last step to generate 
the halftone weight. Each frequency channel comprises a Min-Max texture detector (C2 and D2), followed by cascaded 
averaging filters. Most of the averaging filters also sub-sample the data by a factor of 2 at the same time (that is, only 
produce every other filtered value), so that the peak bandwidth is greatly reduced. Only at the very end the data is up- 
sampled back to the original resolution. 

[0140] The notation in FIG. 18 is intended to emphasize the similarity between the two frequency channels by drawing 
the various blocks as matching pairs connected by the dotted lines (there is no actual connection). 
[0141] Matching pairs of blocks are given the same block number, with the initial letter C for the original frequency 
channel and D for the blurred frequency channel. Matching pairs are therefore designated as (Cn, Dn). with n=[2,... : 9J. 
[0142] The notation used for the various block names is as following: the first number (underscored) denotes the 
size of the window used; the second number following the slash denotes the amount of sub-sampling that is done 
inside the block in each direction. Thus, for example, a filter designated B_3/2 indicates a blurring (lowpass) filter B 
with 3x3 window size, whose output is sub-sampled by a factor of 2 in both directions (that is, only one output is sent 
for every 2x2 = 4 input pixels). 

[0143] The following provides a more detailed description of the various blocks included in the Halftone Estimate 
module. 

[0144] The Sub-Sampling unit D1 is reducing the data rate by a factor of 4 by omitting every other pixel and line of 
the input. The sub-sampling is only needed in the blurred frequency channel D. No sub-sampling is needed for the full- 
resolution frequency channel C. The input to the SS/2 unit is the full-color (L, a, b) output signal BLR_3 from a small 
lowpass filter. The output of the SS/2 unit is a full-color (L, a, b) signal. The preferred sub-sampling method is to apply 
a simple pre-filter prior to sub-sampling in order to eliminate potential frequency aliasing problems. 
[01 45] The two identical Min-Max Detection modules C2, D2 are used for finding peaks and valleys in the input signal. 
By counting the number of peaks and valleys per unit area, a measure of the local frequency is obtained. 
[0146] Each of the two Min-Max Detection units is taking as input a full-color (L, a ; b) signal. Each unit utilizes three 
3x3 windows to indicate when the center pixel of one of the color components is at an extreme value (either peak or 
valley) relative to its 8 neighbors, in accordance with the logic described below. 

[0147] Each color component is independently examined in its own 3x3 window. The output from each Min-Max 
Detection unit is a 2-bit signal indicating the total number of color components that are in an extreme value state. This 
number could vary from zero (no color component was at an extremum) to 3 (all color components are at an extremum). 
When one or two color components are at an extremum. no distinction is made as to which ones are at an extremum; 
only the total number of components at an extremum is outputted 

[0148] FIG. 19 shows the Min-Max Detection structure. For each color component, the outer ring of 8 pixels sur- 
rounding the center pixel (the current pixel of interest) is first analyzed. The 8 outer pixels are further divided into two 
sets of 4 pixels each as shown in FIG. 19. The partitioning of the outer ring into two sets is useful for reducing the 
likelihood of false alarms in detecting straight-line segments as halftones (since most commonly encountered halftones 
are typically classified as clustered dots or line- screens). 

[0149] For each set, the pixel values are compared among the members of the set to determine the minimum and 
maximum values within each set independently: 



A, 



max 



^ max ( Aij ) ; over all (i, j) belonging to the set A ; 



(23) 
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A mjn - min ( Aij ) ; over all (i, j) belonging to the set A ; 



(24) 
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B max " max ( Bi J ) : over al) 0. j) belonging to the set B ; (25) 

5 B mln = min ( Bij ); over all (i j) belonging to the set B ; (26) 

Then the overall outer ring contrast is computed from the above values: 

10 A rin 9 = max ( A max , B max ) - mm ( A mjn B mjn ) ; (27) 

[01 50] The value of A ring is next tested to see if there is any contrast for the outer ring. Regardless of the center pixel 
value, the output is set to zero (not an extremum point) if the value of A rj is less than or equal to a pre-defined small 
threshold T2: 

If (A rlng < T2 ) . return (0) ; (28) 

If, on the other hand, there is sufficient activity in the outer ring (as indicated by outer ring contrast > T2 ), then two 
tests are made to see if the center pixel value is at an extreme value relative to the outer ring values. The center pixel 
value X is defined to be at a peak if it is (significantly) larger than the maximum pixel value of either set: 

,f [( A max + S < X) AND (B max < X)], return (1 ) ; (29) 
where S is the outer ring contrast, scaled by a contrastscaling parameter C: 

S = ^ring 1 c I (30) 

[0151] In one embodiment, the contrast scaling parameter C is set to equal to 8. The actual value of the scaling 
parameter C is a function of the signal noise level at the input. It is desirable to maintain the value of C as a generic 
parameter of the Min-Max Detection units. The value of C can be limited to a power-of-two number so that it could be 
implemented as an arithmetic shift to save the need to perform a division operation per pixel. 

[0152] Similarly, the center pixel value X is defined to be at a valley if it is (significantly) smaller than the minimum 
pixel value from either set A or B: 

lf K A min > X + S) AND (B min > X)], return (1) ; (31) 
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[0153] Equations (29) and (31) determine the two conditions where the output from the 3x3 detection window are 
set to 1 ; in all other cases the output will be set to 0. 

[0154] In a second embodment, the center pixel value X is defined to be at a peak if it is (significantly) larger than 
45 the maximum pixel value of either set: 

lf t( A max + n th < x ) AN D (B max < X)], return (1); (29A) 
so where is the noise threshold defined as: 

N TH = NoiseBias + (NoiseFactor x X) / 256 

55 where the NoiseBias and the NoiseFactor are tuning parameters. 

[0155] Similarly, the center pixel value X is defined to be at a valley if it is (significantly) smaller than the minimum 
pixel value from either set A or B: 
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" [(A mjn > X 4- N TH ) AND (B min - X)l, return (1) ; (31A) 

[0156] Equations (29A) and (31 A) determine the two conditions where the output from the 3x3 detection window are 
5 set to 1 ; in all other cases the output will be set to 0. Note that in this second embodoment, there is no need to compute 
the overall outer ring contrast. 

[0157] Finally, each color component is processed independently through its own separate 3x3 window. The three 
binary outputs of the color components are then added together to form the final 2-bit output of the Min-Max Detection 
module. 

10 [0158] The two Min-Max Detection outputs C2 and D2 are fed to the Cascaded Filter Chain C3-C6 and D3-D6, 
respectively. The first filtering units, C3 and D3 are different between the two chains, but other than that, the subsequent 
units C4-C6 and D4-D6 are all identical. 

[0159] The first Filter unit C3 receives the 2-bit output from the high-resolution Min-Max detection unit C2. The input 
is filtered through the F_7/4 filter, which, in one embodiment, is a 7*7 symmetric, triangular, and separable filter. The 
15 /A notation indicates that filter F_7/4 also sub-samples the filtered output by a factor of 4 in both direction. This means 
that the filter F_71 4 only produces one output pixel for every fourth input pixel and every fourth line, thereby effectively 
reducing the data bandwidth by a factor of 1 6. 

[0160] Since the input to the first filtering unit C3 is limited to 2-bits (rather than 8 bits), the output of the filter is 
normali7ed by scaling the result by a different power-of-two number (e.g., 2). The scaling power should remain a design 
20 parameter. After the first normalization, however, the result has been scaled to fit the range of 8-bits, so that subsequent 
filtering is utilizing the 8-bit representation system from this point forward. 

[0161] The second chain leading filter unit D3 is different than C3 in two aspects. First, the F 5/2 filter only sub- 
samples the input by a factor of 2 (rather than 4) in each direction. This means that the filter only produces one output 
pixel for every other input pixel and every other line, thereby effectively reducing the data bandwidth by a factor of 4. 

25 Since the sub-sampling factor is smaller, the span of the filter can consequently be reduced from 7 (for C3) to 5 (for 
D3). In one embodiment, the normalization factor for the F_5/2 filter was determined to be 2 9 . Note that the (8-bit wide 
now) outputs from the two leading filter units C3 and D3 are both at the same resolution - sub-sampled by 4 in both 
dimension, or 1/16 th of the original input bandwidth. This is because the F_7/4 filter of the C3 unit in the upper chain 
is sub-sampling the data by 4, while the combination of the SS/2 and F_5/2 unit D3 in the lower chain results in an 

30 overall output rate that matches the output rate of C3. 

[0162] The two outputs from the filtering units C3 and D3 are further filtered through the three additional and identical 
units C4-C6 and D4-D6 respectively. Each of the six filtering units processes its respective input signal with an F_3/2 
filter (with coefficients 1-2-1) which also sub-samples the data by a factor of 2 in both directions. Note that each of 
these filters has a total weight of 1+2+1=4, thus the implementation can be simplified by replacing the normalization 

35 division with a simple arithmetic shift right by 2. 

[0163] Since each of the filtering units is also sub-sampling its respective input signal by a factor of 2, the signals at 
the respective outputs of the C6 and D6 filter units are in effect sub-sampled by a factor of 32 in each direction (or 
1024 times reduced in bandwidth). 

[0164] The next two filter units C7 and D7 are special filter units, denoted as a Fz_5 filter. The z suffix indicates a 
40 filter that eliminates any zero terms from the total normalization weight. The general Fz_n filter equation is as following: 

output value - ^V^) = (32) 
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where a^ are the 2D input values, w^ are the 2D filter coefficients, and 5^) is the function defined as: 

6(a^ = 1 if a^ 0 ; otherwise = 0 ; (33) 



[0165] As can be seen from equation (32), the difference from a regular filter is in that the total weight is no longer 
just a known normalization constant. Since the number of incoming pixels having the value of zero is not known in 
advance, an ongoing accumulator for the weight must be kept. At the same time the filter loop is operating, the content 
of the accumulator is increased by the value of the current filter coefficient if the corresponding input value is non zero. 
55 [01 66] Since the total weight is not fixed and known in advance, the final normalization of the filter output is dependent 
on the value of the total weight. However, it is still possible to avoid the division operation in equation (32) by using a 
predetermined multiplication table with multiple choices for the possible total weight values. 
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[0167] The purpose of using the special Fz_5 filters is to obtain reliable frequency and halftone weight estimates 
even when the filter gets very close to an edge. 

[0168] The two MX_5 modules C8 and D8 search for the maximum value in a 5x5 window and output the maximum 
value. 

[0169] Each of the two Interpolation modules C9 and D10 interpolates (i.e., up-samples) the signal back by a factor 
of'32 to restore it to the original resolution. Each interpolation unit is performing bilinear interpolation, essentially gen- 
erating 32*32 pixels for each 4 original pixels. The step size for the bilinear interpolation is 1/32 nd of the original pixel 
grid. 

[0170] The Halftone Weight module HTW receives as input the outputs of the two interpolation units denoted as FRQ 
and FRQ_B. The Halftone Weight module adds togetherthe contribution from each of the input as following: 



HTW = HTW H + HTW L ; (34) 

15 where 

HTW H = (FRQ - T H ) * SF H if FRQ > T H ; 0 otherwise ; (35) 



HTW L = (FRQ_B - T L ) * SF L if FRQ_B >T L ; 0 otherwise ; (36) 

where T H and T L are two predetermined thresholds, and SF H and SF L are two predetermined scale factors for 
the original (high) and filtered (low) frequencies FRQ and FRQ_B respectively. 

[0171] An additional logic ensures that the value of HTW is limited to never exceed the allowed 8-bit range of [0, 255]. 
[0172] FIG. 20 is a diagram illustrating the equations (34), (35), (36) and the clipping effect of the additional logic 
which limits the value of HTW to the allowed range. The region denoted as "LA" represents the line-art region. As 
illustrated in FIG. 20, one particular color screen pattern can change from the location indicated as HFHT to MFHT to 
LFHT as its frequency is changed from high to medium to low. Since the curve illustrated by the loci on the 2D plot is 
convex, it is not possible to distinguish the screen frequency by observing either FRQ or FRQ_B alone. 
[0173] In the description above, elements of an embodiment of the invention may be implemented by hardware, 
firmware, software or any combination thereof. The term hardware generally refers to an element having a physical 
structure such as electronic, electromagnetic, optical, electro-optical, mechanical, electro-mechanical parts, etc. The 
term software generally refers to a logical structure, a method, a procedure, a program, a routine, a process, an algo- 
rithm, a formula, a function, an expression, etc. The term firmware generally refers to a logical structure, a method, a 
procedure, a program, a routine, a process, an algorithm, a formula, a function, an expression, etc that is implemented 
or embodied in a hardware structure (e.g., flash memory, read-only memory (ROM), erasable ROM). Examples of 
firmware may include microcode, writable control store, microprogrammed structure. When implemented in software 
or firmware, the elements of an embodiment of the present invention are essentially the code segments to perform the 
necessary tasks. The softwaretfirmware may include the actual code to carry out the operations described in one 
embodiment of the invention, or code that emulates or simulates the operations. The program or code segments can 
be stored in a processor or machine accessible medium or transmitted by a computer data signal embodied in a carrier 
wave, or a signal modulated by a carrier, over a transmission medium. The "processor readable or accessible medium" 
or "machine readable or accessible medium" may include any medium that can store, transmit, or transfer information. 
Examples of the processor readable or machine accessible medium include an electronic circuit, a semiconductor 
memory device, a read-only memory (ROM), a flash memory, an erasable ROM (EROM), a floppy diskette, a compact 
disk (CD) ROM, an optical disk, a hard disk, a fiber optic medium, a radio frequency (RF) link, etc. The computer data 
signal may include any signal that can propagate over a transmission medium such as electronic network channels, 
optical fibers, air, electromagnetic, RF links, etc. The code segments may be downloaded via computer networks such 
as the Internet, Intranet, etc. The machine accessible medium may be embodied in an article of manufacture. The 
machine accessible medium may include data that, when accessed by a machine, cause the machine to perform the 
operations described in the following. The machine accessible medium may also include program code embedded 
therein. The program code may include machine readable code to perform the operations described in the above 
description of the invention . The term "code" here refers to any type of information that is encoded for machine-readable 
purposes. Therefore, it may include program, code, data, file, etc. 

[01 74] All or part of an embodiment of the invention may be implemented by hardware, software, or firmware, or any 
combination thereof. The hardware, software, or firmware element may have several modules coupled to one another. 



21 

BNSDOCID: <EP 1 380992A1 _!_> 



EP 1 380 992 A1 



A hardware module is coupled to another module by mechanical, electrical, optical, electromagnetic or any physical 
connections. A software module is coupled to another module by a function, procedure, method, subprogram, or sub- 
routine call, a jump, a link, a parameter, variable, and argument passing, a function return, etc. A software module is 
coupled to another module to receive variables, parameters, arguments, pointers, etc. and/or to generate or pass 

5 results, updated variables, pointers, etc. A firmware module is coupled to another module by any combination of hard- 
ware and software coupling melhods above. A hardware, software, or firmware module may be coupled to any one of 
another hardware, software, or firmware module. A module may also be a software driver or interface to interact with 
the operating system running on the platform. A module may also be a hardware driver to configure, set up, initialize, 
send and receive data to and from a hardware device. An apparatus or system may include any combination of hard- 

'£> ware, software, and firmware modules. 

[0175] One embodiment of the invention may bo described as a method or process which is usually depicted as a 
flowchart, a flow diagram : a structure diagram, or a block diagram. Although any of these depictions may describe the 
operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, 
the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process 

15 may correspond to a method, a program, a procedure, a method of manufacturing or fabrication, etc. 



Claims 

20 1. A method for separating an image signal into a set of image planes, the method comprising the operations of: 

(a) searching, via a min-max module, for a minimum and a maximum within at least one window centered on 
a current pixel in the image signal; 

(b) computing, via a dynamic threshold module, for the at least one window, based on the respective minimum 
25 and maximum received from the min-max module and the current pixel, a respective indicator representing 

the distance and direction of the current pixel relative to a respective threshold plane, and outputting a control 
signal based on the indicator, and 

(c) separating the image signal into the set of image planes in accordance with the control signal by including 
a representation of the current pixel in at least one of the image planes, via a separation module. 

30 

2. The method of Claim 1 wherein the image signal comprises a luminance signal, a first chroma signal and a second 
chroma signal, and wherein operation (a) comprises the operations of: 

searching for a luminance minimum and a luminance maximum in the luminance signal within the window; 
35 indexing locations of the luminance maximum and luminance minimum within the window; and 

outputting the luminance minimum, the luminance maximum, and values in the first and the second chroma 
signals that correspond to the locations of the luminance minimum and luminance maximum. 

3. The method of Claim 2 wherein the operations of searching, indexing and outputting are performed for each window 
*o jn a set of windows. 

4. The method of Claim 1 wherein operation (a) comprises the operations of: 

searching for a first minimum and a first maximum within a first window centered on the current pixel in the 
45 image signal, using a first min-max block, the first window corresponding to a first context; and 

searching for a second minimum and a second maximum within a second window centered on the current 
pixel in the image signal, using a second min-max block, the second window corresponding to a second con- 
text; 

50 wherein the second context is substantially larger than the first context. 

5. A system for separating an image signal into a set of image planes, the system comprising: 

(a) a min-max module receiving the image signal, searching for a minimum and a maximum within at least 
55 one window centered on a current pixel in the image signal; 

(b) a dynamic threshold module, in communication with the min-max module, computing, for, the at least 
window, based on the respective minimum and maximum received from the min-max module and the current 
pixel, a respective indicator representing the distance and direction of the current pixel relative to a respective 



22 



3NSDCCID: <EP 1380992A1J _> 



EP 1 380 992 A1 

threshold plane, and outputting a control signal based on the indicator; and 

(c) a separation module, in communication with the dynamic threshold module, separating the image signal 
into the set of image planes in accordance with the control signal by including a representation of the current 
pixel in at least one of the image planes. 

The system of Claim 5 wherein the image signal comprises a luminance signal, a first chroma signal and a second 
chroma signal, and wherein the min-max module: 

searches for a luminance minimum and a luminance maximum in the luminance signal within the window; 

indexes locations of the luminance maximum and luminance minimum within the window; and 

outputs the luminance minimum, the luminance maximum, and values in the first and the second chroma 
signals that correspond to the locations of the luminance minimum and the luminance maximum. 

The system of Claim 6 wherein the min-max module searches for a minimum and maximum within each of a set 
of windows centered on the current pixel in the image signal. 

The system of Claim 5 wherein the min-max module comprises: 

a first min-max block searching for a first minimum and a first maximum within a first window centered on the 
current pixel in the image signal, the first window corresponding to a first context; and 

a second min-max block searching for a second minimum and a second maximum within a second window 
centered on the current pixel in the image signal, the second window corresponding to a second context; 

wherein the second context is substantially larger than the first context. 
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